Regression Functions
Regression functions allow users to compare two or more metrics across a range of items to see the relationship of one (dependent) metric against the other metrics. The Pyramid regression functions have been designed to calculate trend regressions from the data points in the underlying query using four 'linear' models. While regressions are usually shown as lines in charts (often referred to as 'trend' lines), the function is materialized as a series of calculated values for each item in the data set that can be viewed in almost any visualization context, like any other measure.
Note: Regression values added to a visualization are also reflected in the legend, which shows the trend's regression equation as well as its R-Squared value.
Types of Regression Functions
Regression analysis uses the following four linear regression models:
- LINEAR using the formulation $y = mx + b$
- POWER using the formulation $y = bx^{m}$
- EXPONENTIAL using the formulation $y = be^{mx}$
- LOGARITHMIC using the formulation $y = m\ln(x) + b$
Where $m$ is the slope of the trend line, $b$ is the intercept, $x$ is the independent variable, and $y$ is the dependent variable.
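The four models can be illustrated with a short sketch. It assumes the conventional forms (linear y = mx + b, power y = bx^m, exponential y = be^(mx), logarithmic y = m ln(x) + b) and fits each one by ordinary least squares on a linearized version of the data. The helper names `least_squares` and `fit_trend` are hypothetical, and the fitting method is an assumption for illustration; it is not taken from Pyramid's implementation.

```python
import math

def least_squares(u, v):
    """Return (slope, intercept) of the least-squares line v = slope*u + intercept."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((a - mean_u) ** 2 for a in u)
    sxy = sum((a - mean_u) * (c - mean_v) for a, c in zip(u, v))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_u

def fit_trend(x, y, model):
    """Fit one trend model; return (m, b, fitted) with one fitted value per point.

    Assumption: power and exponential need y > 0; power and logarithmic need x > 0.
    """
    if model == "linear":            # y = m*x + b
        m, b = least_squares(x, y)
        predict = lambda t: m * t + b
    elif model == "power":           # ln(y) = ln(b) + m*ln(x)
        m, ln_b = least_squares([math.log(t) for t in x], [math.log(t) for t in y])
        b = math.exp(ln_b)
        predict = lambda t: b * t ** m
    elif model == "exponential":     # ln(y) = ln(b) + m*x
        m, ln_b = least_squares(x, [math.log(t) for t in y])
        b = math.exp(ln_b)
        predict = lambda t: b * math.exp(m * t)
    elif model == "logarithmic":     # y = m*ln(x) + b
        m, b = least_squares([math.log(t) for t in x], y)
        predict = lambda t: m * math.log(t) + b
    else:
        raise ValueError(f"unknown model: {model}")
    return m, b, [predict(t) for t in x]

# Usage: data generated from y = 3 * x**2 is recovered by the power model.
x = [1.0, 2.0, 3.0, 4.0]
m, b, fitted = fit_trend(x, [3 * t ** 2 for t in x], "power")
```

Each model is called 'linear' because, after the logarithmic transformation noted in the comments, the slope and intercept are found with the same straight-line fit.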
Using the Regression Functions
To use the Regression functions, click Regression > <Regression Model> > <Value / Measure> from the Query ribbon.
Basic Regressions
For a basic regression, you need to pick one of the regression models from the drop-down, and then select the values or measures in the current query from the sub-menu:
Selecting one of the value measures generates a new value chip that builds a simple regression of that measure in the context of your query and automatically adds it to the drop zones for your visual. The following example uses purple arrows to highlight the added items:
Once you have that chip, you are free to move it to any other drop zone like any other metric in your data model. In the example below, the visual has been changed to a column chart (purple arrow) which shows actual sales with the regression chip used in the secondary axis (green arrow) to drive a line chart - effectively showing a basic trend line for the given data set.
Multi-measure Regression
If the visual is a scatter or bubble chart and there are two measures drawn in the query (on the X and Y value axes), the tools allow you to drive a dependent / independent variable regression (shown below). This can also be accomplished on any visual, not just scatter plots, using the context menu driven regression tools described below.
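The difference between a basic regression and a dependent / independent regression can be sketched as follows. The data, the measure names (`sales`, `ad_spend`), and the `least_squares` helper are hypothetical. The sketch also assumes that a basic regression uses the item's position (1, 2, 3, ...) as its independent variable, which this page does not state.

```python
def least_squares(u, v):
    """Return (slope, intercept) of the least-squares line v = slope*u + intercept."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((a - mean_u) ** 2 for a in u)
    sxy = sum((a - mean_u) * (c - mean_v) for a, c in zip(u, v))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_u

# Hypothetical query result: one row per item, two measures.
items = ["Jan", "Feb", "Mar", "Apr"]
sales = [9.0, 13.0, 15.0, 21.0]      # dependent measure
ad_spend = [1.0, 2.0, 2.5, 4.0]      # independent measure

# Basic regression: one measure against item position (assumed 1..n).
position = list(range(1, len(items) + 1))
m_basic, b_basic = least_squares(position, sales)
trend_by_item = {item: m_basic * p + b_basic for item, p in zip(items, position)}

# Multi-measure regression: the dependent measure against the independent one.
m_multi, b_multi = least_squares(ad_spend, sales)
predicted_sales = [m_multi * a + b_multi for a in ad_spend]
```

In both cases the result is one calculated value per item, which is why the regression can be placed in a drop zone like any other measure.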
Removing Regression Values
To exclude regression logic from your query, simply remove the orange value chips from the drop zones.
Context Menu driven Regressions
Instead of using the ribbon regression tool, you can also use the context calculation menus to build regressions on one or more measures. Unlike the basic ribbon technique, the context menu option allows for regressions across two or more measures (dependent and independent variables), offering more sophisticated logic. This is similar to the scatter plot option described above, except that it operates in any visual format.
Explanations
Applying a regression calculation to the query auto-generates an explanation that you can show in your visual using the Notes tool. This explanation describes how the regression was calculated; it contains the calculation name, the regression function used, the formula used to draw the regression, the R² value, and an explanation of what the R² value is (how closely the data matches the regression).
To view the auto-generated explanation, click Show Notes from the Design ribbon. The explanations can also be viewed downstream in presentations.
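As a rough illustration of what the R² value in the explanation measures, the sketch below uses the common definition of the coefficient of determination, 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` is hypothetical, and whether Pyramid evaluates R² on the original or on transformed values for the non-linear models is not stated here, so treat this as an assumption.

```python
def r_squared(actual, fitted):
    """Coefficient of determination: 1.0 means the fitted values match the data exactly."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))   # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)         # total variation
    return 1.0 - ss_res / ss_tot

# A perfect fit scores 1.0; predicting the mean for every item scores 0.0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Values closer to 1 indicate that the regression follows the data more closely.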
In the example below, the explanation describes how the best fit regression calculation was evaluated for the query: